PREFACE

This report describes part of a comprehensive and continuing program
of research in multispectral remote sensing of the environment from
aircraft and satellites and the supporting effort of ground-based researchers
in recording, coordinating, and analyzing the data gathered by these means.
The basic objective of this program is to improve the utility of remote
sensing as a tool for providing decision makers with timely and economical
information from large geographical areas.

The feasibility of using remote sensing techniques to detect and
discriminate between objects or conditions at or near the surface of the earth
has been demonstrated. Applications in agriculture, urban planning, water
quality control, forest management, and other areas have been developed.

The thrust of this program is directed toward the development and
improvement of advanced remote sensing systems and includes assisting in data
collection, processing and analysis, and ground truth verification.

The research covered in this report was performed under NASA Contract
NAS9-14123. The program was directed by R. R. Legault, Director of ERIM's
Infrared and Optics Division and an Institute Vice-President, and J. D.
Erickson, Head of Information Systems and Analysis Department. The
institute number for this report is 109600-16-F.

The authors wish to acknowledge the administrative direction provided 
by Mr. R. R. Legault and Dr. Jon D. Erickson and the technical assistance 
given by Mr. W. W. Pillars. Ms. D. Dickerson, L. Parker, and G. Sotomayor 
are thanked for their secretarial assistance. 
1

INTRODUCTION 

With the development of satellite multispectral scanners (MSS) it has
become possible to gather data from large areas. This data collection
effort has the potential of providing timely information concerning the
state of world-wide crop production. For this potential to be realized,
it is necessary to find methods of processing the data in a timely
and cost-effective manner.

A major stumbling block in the way of achieving cost-effective
processing is the requirement of large amounts of ground information. This
ground information is required to train the computer to recognize different
crop types. Because measurement conditions vary when the data
are collected, the computer must be retrained on a regular basis. The crop
signatures are not constant in either time or space. The need to retrain
the computer requires new ground information, which is both costly and
time consuming.

The first objective of this investigation is to develop signature
extension techniques which will allow the crop signatures to be updated or
corrected for variations in the measurement conditions so that signatures
derived from the training data set (TDS)* can be successfully used for
recognition on a different, removed, recognition data set (RDS). This
objective can be accomplished using the following approaches:

1) Extract signatures from the TDS and then find a transformation
which will map those signatures onto the RDS. The transformation
will correct for the differing measurement conditions of the two
data sets.

2) Preprocess the two data sets to remove the effects of the varying
measurement conditions and then extract signatures from the TDS
and apply them to the RDS.


We will call these two signature extension methods Type 1 and Type 2,
respectively.

*Appendix IV lists the abbreviations used in this report.

In this report we will examine the sources of variation in the data
and two Type 1 and two Type 2 methods for correcting for those variations.
The Type 1 methods are the ASC and MASC algorithms, which are discussed in
sections 5.1 and 5.2 respectively. The Type 2 methods are Ratios and
RADIFF. These are discussed in sections 6.1 and 6.2. In section 7 we will
discuss the results of an experiment to determine the possible effects on
recognition of variation in atmospheric state and scanner view angle.

While the signature extension methods which we examine in this report
are applied to single-pass data, the general approaches can also be used to
extend multitemporal signatures. In the future some of the methods
developed here will be extended for use on multitemporal data sets.

The second objective of this study is to investigate methods of 
defining training fields without in situ ground information. If training 
fields can be identified without the use of ground information then locally 
derived signatures may be used for the recognition of every data set. In 
section 8 we discuss some preliminary investigations into this problem. 
2

SUMMARY 

Investigations into the sources and nature of between-scene data
variations were carried out. The variations in the data were seen to be
due to three types of variations in the measurement conditions. These
were:

1. Instrumental,

2. Environmental,

3. On-the-ground changes.

These variations in measurement conditions were responsible for multiplicative
and additive variations in the crop signatures when going from one data set
to another.

In order to correct for varying measurement conditions four signature
extension methods were developed and tested. The four were:

1. ASC,

2. MASC,

3. Ratio of Spectral Bands,

4. RADIFF.

Each method was, in theory, capable of correcting for a subset of the possible
variations in measurement conditions.

The four methods were tested on LANDSAT-1 data that was collected for
the CITARS project. Of the four methods the ASC and MASC algorithms performed
the best. MASC showed the most promise as a signature extension technique and
as a tool for further investigating the nature of the inter-scene data
variations.
Initial investigations into defining training fields without the aid
of ground information have been carried out. These investigations have been
based on the attempt to define regions of the data space which are occupied
by single crop types. To aid this attempt two methods of transforming a
region from one data set's space to another data set's space were developed.
The methods are:

1. Overlay Method,

2. Method of Affine Transformations.

The work in this area is at too early a stage of development to be able to
make any judgements as to the best method of defining the regions in the
data space that are associated with particular crop types.



FORMERLY WILLOW RUN LABORATORIES. THE UNIVERSITY OF MICHIGAN 


3 

SOURCES OF DATA VARIABILITY 

In order that we can develop methods to correct for variations in
the data between the TDS and RDS we must investigate the source of those
variations.

There are a number of factors which can be sources of variation in
scanner signals. Some of these sources are listed below, where we have
divided them into three categories: instrumental sources, environmental
sources, and scene related sources of variation.

SOURCES OF VARIATION IN MULTISPECTRAL SCANNER SIGNALS

A. Instrument

   Scanner electronics and recorder instabilities
   Gain changes
   Nonuniform angular responsivity

B. Environment

   Changes in irradiance
   Changes in atmospheric transmittance
   Changes in atmospheric path radiance

C. Scene

   Geometric effects
   Reflectance effects

Instrumental sources are associated with the mechanics, optics, and
electronics of the multispectral scanner. Included in this category are
gain changes, non-uniform angular responsivity, and other recorder and
electronic instabilities. Since many of these effects are deterministic,
they can be eliminated from the data during an initial data preparation
stage.

Environmental sources of variation include changes in the magnitude
and spectral make-up of the irradiance at ground level, changes in
atmospheric transmittance, and changes in path radiance. Changes in irradiance
result from changes in the atmospheric state as well as from solar positional
changes that occur between the times the data sets are collected.
Atmospheric transmittance and path radiance will also change as the
atmospheric state changes. These quantities are also functions of scan
angle since they depend on the path length from the ground to the scanner.

In Fig. 1 we see the variation of path radiance, as calculated using the
ERIM Radiative Transfer Model [1], for different atmospheres (as
represented by visibility). It is clear that path radiance can vary considerably
with changing atmospheric state, up to 37% for the visibilities shown in
Fig. 1. Shown in Table 1 is the effect of scan angle on both the path
radiance and total radiance received by a scanner. The change in path
radiance over a range of scanner view angles, from +6° to -6° relative to
nadir, is greater than 18%. The change in total radiance, over the same
range of view angles, can be as large as 10%.

The in-scene effects are of two types. The first effect is the
geometric variation due to sun angle and bidirectional reflectance. These
will cause the amount of radiation reflected in a particular direction to
depend on time of day and position of the target in the scene. The other
in-scene effect is variation in target reflectance. This may be caused by
differences in moisture content of the soil or soil type. Differences
in irrigation, fertilization, or crop vigor will also cause variation in the
crop reflectances.

To see how these effects combine to affect the variability of the
MSS signals we write the equation for the signal recorded by the scanner
in channel i for crop $\alpha$,

$$S_\alpha^{(i)} = G^{(i)} E^{(i)} T^{(i)} \rho_\alpha^{(i)} + G^{(i)} L_p^{(i)}. \tag{1}$$

The instrumental effects are contained in the gain term $G^{(i)}$, while the
atmospheric effects are contained in the irradiance $E^{(i)}$, the transmittance
$T^{(i)}$, and the path radiance $L_p^{(i)}$. The in-scene effects are contained in
the reflectance $\rho_\alpha^{(i)}$. Thus, the effect of variations in atmosphere and


[1] Robert E. Turner, "Radiative Transfer in Real Atmospheres", ERIM
Report No. 190100-24-T, December 1973.
FIGURE 1. DEPENDENCE OF PATH RADIANCE AS A FUNCTION OF WAVELENGTH
FOR SEVERAL VISIBILITIES. Altitude = 910 km. Solar
Zenith = 62°, Green Vegetation Target on Green Vegetation
Background. (Calculation based on ERIM Radiative Transfer
Model)
TABLE 1. SCAN ANGLE EFFECTS ATTRIBUTABLE TO THE ATMOSPHERE

Spectral Radiances* (mW/cm²-sr-µm)

Azimuth Relative   Scan Angle Relative    λ = 0.55 µm         λ = 0.75 µm
to Sun, φ          to Nadir, θ            Path     Total      Path     Total
38°                (-) 6°                 2.51     4.70       0.98     2.78
                   0°                     2.71     4.90       1.06     2.86
218°               6°                     2.98     5.17       1.17     2.96

Percent Change From Nadir (θ = 0°) Value

                                          λ = 0.55 µm         λ = 0.75 µm
φ                  θ                      Path     Total      Path     Total
38°                (-) 6°                 -7.3     -4.2       -7.2     -2.8
                   0°                     0        0          0        0
218°               6°                     10.2     5.5        10.1     3.7

Percent Change From One Side of Nadir To Other

                                          λ = 0.55 µm         λ = 0.75 µm
Scan Angle Change                         Path     Total      Path     Total
-6° to +6°                                18.9     10.1       18.7     6.6

*Target Reflectance = Background Albedo = 8%
 Solar Zenith Angle = 39°
 Optical Thickness of Atmosphere = 0.3812 for 0.55 µm
 and 0.2854 for 0.75 µm.
instrument response is to produce both multiplicative and additive
variations in the recorded signal. The in-scene variations will produce
multiplicative variations in the scanner signal.

In this report we investigate a number of approaches to remove these
variations. The ASC method produces an additive signature correction and
is thus primarily concerned with variations in path radiance. The MASC
algorithm produces both an additive and multiplicative signature correction.
It is thus potentially capable of correcting for all of the variations we
have discussed; however, variations in the reflectances can only be corrected
for in an average way. The Ratios of Spectral Bands method can correct for
all multiplicative variations if they are correlated between channels and
if the path radiances are negligible. The RADIFF method removes from the
Ratio method the restriction that the path radiance be negligible.
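
The distinction between multiplicative and additive variations can be illustrated numerically. The following is a minimal sketch, assuming the signal model of equation (1); all numerical values (reflectances, irradiances, transmittances, path radiances) are hypothetical and chosen only for illustration. It shows that a band ratio is unchanged by a multiplicative change that is the same in both channels when the path radiance is zero, and that the ratio shifts once a differing path radiance is present.

```python
def scanner_signal(gain, irradiance, transmittance, reflectance, path_radiance):
    """Equation (1): S = G*E*T*rho + G*L_p for one channel and one crop."""
    return gain * irradiance * transmittance * reflectance + gain * path_radiance


# Hypothetical two-channel values for one crop under two measurement conditions.
rho = [0.08, 0.40]  # assumed crop reflectance in each channel
cond_1 = dict(gain=1.0, irradiance=100.0, transmittance=0.80)
cond_2 = dict(gain=1.0, irradiance=90.0, transmittance=0.72)  # same factor in both channels


def signals(cond, path_radiance):
    """Signal in each channel for the given condition and per-channel path radiance."""
    return [scanner_signal(reflectance=r, path_radiance=lp, **cond)
            for r, lp in zip(rho, path_radiance)]


# Case 1: negligible path radiance. The band ratio is the same in both conditions.
s1 = signals(cond_1, [0.0, 0.0])
s2 = signals(cond_2, [0.0, 0.0])
ratio_1 = s1[0] / s1[1]
ratio_2 = s2[0] / s2[1]

# Case 2: non-zero path radiance that differs between conditions. The ratio shifts.
s1_lp = signals(cond_1, [2.5, 1.0])
s2_lp = signals(cond_2, [3.0, 1.2])
ratio_1_lp = s1_lp[0] / s1_lp[1]
ratio_2_lp = s2_lp[0] / s2_lp[1]

print(ratio_1, ratio_2)
print(ratio_1_lp, ratio_2_lp)
```

The second case is the situation that the additive corrections are meant to address.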
4

DESCRIPTION OF DATA 


The data sets used in this study were originally designed for the
CITARS [2] project. They consisted of a number of 8 km x 32 km sites in
Indiana and Illinois collected by LANDSAT-1* during the 1973 growing
season. In particular the data sets Fayette Co., Illinois, June 10,
June 11, and August 21; Shelby Co., Indiana, June 8; and White Co.,
Indiana, August 21 were employed.

For the June period the Fayette, June 11 (F6-11) data set was
arbitrarily defined as the training data set (TDS). The Shelby, June 8 (S6-8)
and Fayette, June 10 (F6-10) were chosen as the recognition data sets (RDS).
For the August period White, August 21 (W8-21) was defined to be the TDS
while Fayette, August 21 (F8-21) was chosen as the RDS.

For the CITARS study certain fields of each site were chosen for
training and others were designated as test fields. In our TDS, signatures
were extracted from the CITARS designated training fields while all fields
in the RDS (i.e., both training and test) were used to test the recognition
performance of the various signature extension methods. The results of the
recognition experiments are described in terms of field-center pixel
recognition and confusion of the major crops. Field-center pixel recognition
performance was used to evaluate the techniques rather than crop proportion
estimation because the objective of signature extension is the mapping of
class spectral information from one data set to another. Proportion
estimation depends on the correct recognition of impure boundary pixels and is
therefore a function of the type of classifier used. During June the major


[2] W. A. Malila, D. P. Rice and R. C. Cicone, "Final Report on the CITARS
Effort by the Environmental Research Institute of Michigan", ERIM Report No.
109600-12-F, February 1975.

*The LANDSAT-1 MSS bands are numbered 4, 5, 6, 7; in this report we have
renumbered them as channels 1, 2, 3, 4.
crop was wheat and the results are reported as percent correct wheat
recognition and percent correct "other" recognition. Actually "other"
crops were considered correctly recognized if they were classified as
anything other than wheat. Using this definition of "correct other", we
could leave everything unclassified and have 100% "correct other". For
this reason the "percent other correct" results may be somewhat misleading
in terms of evaluating the value of a particular signature extension method.
During the August period corn and soybeans were the major crops. The
"percent other correct" has the same meaning as for the June period.

One change was implemented in the June Fayette data sets. The original 
test field designated 29-29 was labeled as being all wheat. Investigations 
into the datum values as well as the photo-imagery led to the conclusion 
that field 29-29 was in fact three separate fields. The middle field was 
determined to be wheat and the coordinates of field 29-29 were adjusted to 
include the central ten pixels. 

The signatures, for this investigation, were extracted from each of
the training fields in the TDS and then were combined on the basis of
ground information concerning the crops of each training field. Thus the
field signatures from every wheat training field were combined, using a $\chi^2$
rejection test, to form a wheat crop signature. The $\chi^2$ rejection test was
based on a final $\chi^2$ distance rejection threshold corresponding to a .001
probability of false rejection under the assumption of normality and four
degrees of freedom. This resulted in rather large signature covariances
and a smaller $\chi^2$ level may have produced better signatures. As it was, two
wheat modes remained apparent and a second wheat signature, designated
"wheat 2", was produced. The signature set used during the June period
included: wheat, wheat 2, water, trees, bare soil, and weeds.

For the August period, the signatures from W8-21 were formed in a
similar manner to the F6-11 signatures. The only difference is in the $\chi^2$
rejection level used when combining the individual field signatures. The
final $\chi^2$ distance rejection threshold corresponded to a .01 probability of
false rejection. The August signatures used were: corn, soybeans,
pasture, quarry, and trees.
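
The distance thresholds implied by the two rejection levels can be computed directly. The sketch below is an illustration, not the procedure used in the report: it assumes only the stated normality and four degrees of freedom, for which the $\chi^2$ upper-tail probability has the closed form $e^{-x/2}(1 + x/2)$, and it finds the threshold by bisection. The function names are hypothetical.

```python
import math


def chi2_survival_4dof(x):
    """P(X > x) for a chi-square variable with four degrees of freedom."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


def rejection_threshold(p_false_rejection, lo=0.0, hi=200.0, tol=1e-10):
    """Bisect for the distance whose upper-tail probability equals p."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_survival_4dof(mid) > p_false_rejection:
            lo = mid  # tail probability still too large, threshold lies further out
        else:
            hi = mid
    return 0.5 * (lo + hi)


june_threshold = rejection_threshold(0.001)   # level used for the F6-11 signatures
august_threshold = rejection_threshold(0.01)  # level used for the W8-21 signatures

print(round(june_threshold, 2), round(august_threshold, 2))
```

Under these assumptions the .001 level corresponds to a squared distance of about 18.5 and the .01 level to about 13.3, so the June test admitted field signatures lying further from the combined signature, which is consistent with the larger covariances noted above.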

Recognition was performed on the RDS using the ERIM Linear
Classifier [3]. A null test threshold corresponding to a .001 $\chi^2$
probability was used. Thus each pixel was classified into one of (n+1) bins
where n was the number of signatures. Non-major crops put in the
unclassified bin were considered to have been correctly classified.


[3] R. B. Crane, W. Richardson, R. H. Hieber and W. A. Malila, "A Study
of Techniques for Processing Multispectral Scanner Data", ERIM Report No.
31650-155-T, September 1973.
5

TYPE 1 SIGNATURE EXTENSION METHODS 


As described in the introduction (section 1) a Type 1 method is one
which produces a mapping of the TDS signatures onto the RDS. This mapping
may account for some or all of the inter-scene variability which exists
between the TDS and RDS. In this report we investigate two Type 1 methods:
ASC (additive signature correction), and MASC (multiplicative and additive
signature correction). Both methods have performed reasonably well on the
data sets tested.

5.1 ASC 

The equation for the signal recorded by the scanner in channel i for
crop $\alpha$ is (see discussion after equation (1)):

$$S_\alpha^{(i)} = G^{(i)} E^{(i)} T^{(i)} \rho_\alpha^{(i)} + G^{(i)} L_p^{(i)}.$$

If we use subscripts 1 and 2 to denote the TDS and RDS, respectively, the
crop signature for the RDS can be related to the TDS crop signature by

$$S_{\alpha 2}^{(i)} = A_\alpha^{(i)} S_{\alpha 1}^{(i)} + B_\alpha^{(i)}.$$

We have defined

$$A_\alpha^{(i)} = \frac{G_2^{(i)} E_2^{(i)} T_2^{(i)} \rho_{\alpha 2}^{(i)}}{G_1^{(i)} E_1^{(i)} T_1^{(i)} \rho_{\alpha 1}^{(i)}} \tag{2}$$

and

$$B_\alpha^{(i)} = G_2^{(i)} L_{p2}^{(i)} - A_\alpha^{(i)} G_1^{(i)} L_{p1}^{(i)}. \tag{3}$$
While $E^{(i)}$, $T^{(i)}$, and $L_p^{(i)}$ all depend on atmospheric conditions, it is
apparent that different $L_p^{(i)}$'s for two data sets amount to a change in the
reference level with respect to which the radiance measurements of the
target are made. Thus for signatures obtained from the TDS the means of
the various crops may be translated up or down compared to the crop means
in the RDS. If we assume that $E^{(i)}$ and $T^{(i)}$ are the same for
both data sets, or more precisely that

$$A_\alpha^{(i)} = \frac{G_2^{(i)} E_2^{(i)} T_2^{(i)} \rho_{\alpha 2}^{(i)}}{G_1^{(i)} E_1^{(i)} T_1^{(i)} \rho_{\alpha 1}^{(i)}} = 1, \tag{4}$$

then the TDS signatures may be extended to the RDS by finding the
appropriate translation. That is

$$S_{\alpha 2}^{(i)} = S_{\alpha 1}^{(i)} + B^{(i)};$$

note that under the assumption of equation (4) $B^{(i)}$ is independent of the
crop type $\alpha$.

To examine the validity of equation (4) we plot the F6-11 signature means
versus signature means obtained from F6-10 (Figure 2) and versus signature
means from S6-8 (Figure 3). In Figures 2 and 3 the dashed lines are the
equation

$$L_{\alpha 1}^{(i)} = L_{\alpha 2}^{(i)};$$

the solid line is a least-squares fit of the equation

$$L_{\alpha 1}^{(i)} = L_{\alpha 2}^{(i)} - B^{(i)}$$

to the data. We see that for F6-10 equation (4) holds quite well but for
S6-8 the assumption is not as good. From the figures we can estimate the
amount of translation required to extend the F6-11 signatures to F6-10 and
S6-8. The values of $B^{(i)}$ obtained from the fitted lines in the figures
are listed in Table 2.
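
For a line whose slope is fixed at one, the least-squares estimate of the offset reduces to the mean difference between the paired signature means. The following is a minimal sketch of that fit for a single channel; the signature means used are hypothetical and are not the values plotted in Figures 2 and 3.

```python
import math


def fit_translation(tds_means, rds_means):
    """Least-squares B for the unit-slope line  tds = rds - B  in one channel."""
    if len(tds_means) != len(rds_means) or not tds_means:
        raise ValueError("need equal-length, non-empty lists of signature means")
    differences = [r - t for t, r in zip(tds_means, rds_means)]
    return sum(differences) / len(differences)


def rms_residual(tds_means, rds_means, b):
    """Root-mean-square departure of the points from the fitted line."""
    residuals = [t - (r - b) for t, r in zip(tds_means, rds_means)]
    return math.sqrt(sum(e * e for e in residuals) / len(residuals))


# Hypothetical signature means for four classes in one channel.
tds = [20.0, 25.0, 31.0, 40.0]
rds = [21.5, 25.5, 32.0, 41.0]

b = fit_translation(tds, rds)
scatter = rms_residual(tds, rds, b)
print(b, scatter)
```

A large residual relative to the fitted offset would suggest that a pure translation, and hence equation (4), is a poor description of the pair of data sets.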
FIGURE 3. PLOT OF F6-11 SIGNATURES VS S6-8 SIGNATURES. ■ Trees,
• Wheat, ▲ Bare, + Weeds. Fitted line: $L_{\alpha 1}^{(i)} = L_{\alpha 2}^{(i)} - B^{(i)}$.
TABLE 2. TRANSLATION VECTORS FOR EXTENDING F6-11 SIGNATURES
TO F6-10 AND S6-8. Values from Figures 2 and 3

               F6-11 → F6-10     F6-11 → S6-8
Channel i      B(i)              B(i)
1              +1                +1
2              +1                +1
3              +1                +2
4              +1                +2

The values of $B^{(i)}$ in Table 2 are not of any use for signature
extension since it was necessary to use ground information from F6-10 and
S6-8 to obtain them. This, of course, defeats the objective of signature
extension. A different method must be found to estimate the translation
vectors. We recall from equation (3),

$$B^{(i)} = G_2^{(i)} L_{p2}^{(i)} - G_1^{(i)} L_{p1}^{(i)},$$

(using eq. (2) and (4)), that the translation vector is just the difference
between the recorded path radiance of the two data sets. Thus if the gains
and path radiances were known the vectors could be calculated. To find the
path radiances, a model, such as the ERIM Radiative Transfer Model, could
be used with measured atmospheric inputs.

In the absence of atmospheric inputs it is necessary to get information
concerning the relative magnitudes of the path radiances from the two data sets.
One method of doing this would be to search both the TDS and RDS for the
darkest objects in each channel. However, even if in one data set the darkest
object had zero reflectance, so that all of the radiation received by the
scanner was path radiance, in the other data set no such object may exist.
In order to try to avoid the problem that isolated dark objects may
correspond to different targets in the two data sets, the dark objects whose
scanner values lie at the bottom of the histogram continuum are used. By
choosing the dark objects in this way we are using more information than
is contained in the data value alone. The dark object is determined by its
relationship to the majority of other targets in the scene. To illustrate
this we have constructed a hypothetical histogram of the scanner values in
a single channel. In Figure 4 is shown the lower portion of the histogram.
The value which is chosen to represent the dark object from this histogram
is 13. Denoting the dark object by DO we empirically estimate the
translation vectors by

$$B^{(i)} = DO_2^{(i)} - DO_1^{(i)}.$$

Using this method for extending from F6-11 to F6-10 and S6-8 yields
the translation vectors listed in Table 3.


TABLE 3. TRANSLATION VECTORS OBTAINED USING DARK OBJECT SEARCH

               F6-11 → F6-10     F6-11 → S6-8
Channel i      B(i)              B(i)
1              2                 4
2              0                 4
3              1                 -1
4              0                 0
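
The dark object search described above can be sketched in code. The report does not give the exact rule for locating the bottom of the histogram continuum, so the rule used here is an assumption: the dark object is taken as the lowest scanner value that begins an unbroken run of occupied histogram bins. The parameters `min_count` and `run_length`, the function names, and the histograms themselves are all hypothetical.

```python
def dark_object_value(histogram, min_count=1, run_length=5):
    """Return the lowest value that starts an unbroken run of occupied bins.

    histogram  : dict mapping integer scanner value -> pixel count
    min_count  : a bin counts as occupied when it holds at least this many pixels
    run_length : number of consecutive occupied bins that define the continuum
    Both thresholds are illustrative assumptions, not values from the report.
    """
    if not histogram:
        raise ValueError("histogram is empty")
    top = max(histogram)
    for value in range(min(histogram), top + 1):
        if all(histogram.get(value + k, 0) >= min_count for k in range(run_length)):
            return value
    raise ValueError("no continuous run of occupied bins found")


def asc_translation(tds_histograms, rds_histograms, **kwargs):
    """Per-channel B = DO_2 - DO_1 (RDS dark object minus TDS dark object)."""
    return [dark_object_value(rds, **kwargs) - dark_object_value(tds, **kwargs)
            for tds, rds in zip(tds_histograms, rds_histograms)]


# Hypothetical single-channel histograms: a few isolated dark pixels, then a
# continuum beginning at 13 in the TDS and at 15 in the RDS.
tds_hist = {7: 1, 9: 2, 13: 4, 14: 9, 15: 20, 16: 35, 17: 50, 18: 64}
rds_hist = {6: 1, 11: 1, 15: 5, 16: 11, 17: 24, 18: 40, 19: 55, 20: 70}

b_vector = asc_translation([tds_hist], [rds_hist])
print(dark_object_value(tds_hist), dark_object_value(rds_hist), b_vector)
```

In this sketch the isolated pixels at values such as 7 and 9 are ignored because they are not connected to the bulk of the histogram, which is the behavior the text motivates.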


The results of using the ASC method to extend signatures for recognition
are shown in Table 4. Also listed, for comparison, are the results of using
untransformed (UT) signatures.
FIGURE 4. HYPOTHETICAL HISTOGRAM OF SCANNER VALUES
SHOWING HOW ASC DARK OBJECTS ARE CHOSEN
TABLE 4. COMPARISON OF RECOGNITION USING ASC TRANSFORMED
AND UNTRANSFORMED SIGNATURES

                  Signature
Training          Extension    Recognition      Center-Field Pixel Recognition
Data Set          Method       Data Set

                                                Correct Wheat    Correct Other
Fay, 11 June      UT           Fay, 10 June     64.0%            89.3%
Fay, 11 June      ASC          Fay, 10 June     89.5%            84.6%
Fay, 11 June      UT           She, 8 June      41.5%            95.9%
Fay, 11 June      ASC          She, 8 June      84.9%            86.3%

                                                Correct Corn     Correct Soy      Correct Other
White, 21 Aug     UT           Fay, 21 Aug      1.7%             10.0%            70.9%
White, 21 Aug     ASC          Fay, 21 Aug      [illegible]      [illegible]      [illegible]
As can be seen from Table 4 the ASC method, in two of the cases,
significantly improved major crop recognition. It should be recalled that
the "Correct Other" category includes targets which may not have been
recognized as the correct class but which were not recognized as wheat.

In the next section we will investigate a Type 1 method which does
not rely on the assumption of equation (4).

5.2 MASC 

MASC is an algorithm which provides a mapping of signatures from one
data set to another. It is potentially capable of correcting the differences
in signatures caused by variations between the two data sets of:

1. Atmospheric and solar illumination conditions. This includes
   differences in sun-target-scanner geometries.

2. Electronic gain and other instrumental parameters.

3. And, in an average way, soil type and moisture.


In order to discuss the MASC algorithm we must see how these three 
sources of variations affect the datum values recorded by the scanner. 
To do that we will begin with a review of the sources of data variation. 

Consider first the radiance received by the scanner from the "mean"
of crop $\alpha$ in channel i,

$$L_\alpha^{(i)} = E^{(i)} T^{(i)} \rho_\alpha^{(i)} + L_p^{(i)},$$

where $E^{(i)}$ is the irradiance incident on the target in channel i, $T^{(i)}$ is the
transmittance of the radiation through the atmosphere from the target to the
sensor, and $L_p^{(i)}$ is the scattered path radiance in channel i. For two data
sets, 1 and 2, a difference in atmospheric and solar illumination conditions
will result in different values for $E^{(i)}$, $T^{(i)}$, and $L_p^{(i)}$. Differences in soil
conditions may also result in different values for the reflectance $\rho_\alpha^{(i)}$
for the two data sets. If we further allow for a change in gain, $G^{(i)}$,
between the two data sets then the signals actually recorded for the same
crop from two different data sets are:

$$S_{\alpha 1}^{(i)} = G_1^{(i)} L_{\alpha 1}^{(i)} = G_1^{(i)} E_1^{(i)} T_1^{(i)} \rho_{\alpha 1}^{(i)} + G_1^{(i)} L_{p1}^{(i)}, \tag{5}$$

$$S_{\alpha 2}^{(i)} = G_2^{(i)} L_{\alpha 2}^{(i)} = G_2^{(i)} E_2^{(i)} T_2^{(i)} \rho_{\alpha 2}^{(i)} + G_2^{(i)} L_{p2}^{(i)}. \tag{6}$$


If we wish to extend the signatures extracted from data set 1 to
data set 2, in a way which will yield accurate recognition, then we must
find a mapping such that

$$S_{\alpha 2}^{(i)} = A_\alpha^{(i)} S_{\alpha 1}^{(i)} + B_\alpha^{(i)}. \tag{7}$$

By substituting equations (5) and (6) into equation (7) it is found
that

$$A_\alpha^{(i)} = \frac{G_2^{(i)} E_2^{(i)} T_2^{(i)} \rho_{\alpha 2}^{(i)}}{G_1^{(i)} E_1^{(i)} T_1^{(i)} \rho_{\alpha 1}^{(i)}} \tag{8}$$

and

$$B_\alpha^{(i)} = G_2^{(i)} L_{p2}^{(i)} - A_\alpha^{(i)} G_1^{(i)} L_{p1}^{(i)}. \tag{9}$$

Equation (7) defines a multiplicative and additive signature
correction (MASC) which maps the signature for crop $\alpha$ in the TDS onto
the signature for crop $\alpha$ in the RDS. What is necessary for successful
signature extension is to obtain the MASC parameters $A_\alpha^{(i)}$ and $B_\alpha^{(i)}$.
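
As a check on the algebra, equations (8) and (9) can be evaluated numerically and substituted back into equation (7). The sketch below does this for a single channel and a single crop; every numerical value is hypothetical and serves only to confirm that the mapping reproduces the data set 2 signal.

```python
def recorded_signal(gain, irradiance, transmittance, reflectance, path_radiance):
    """Equations (5) and (6): S = G*E*T*rho + G*L_p for one crop and channel."""
    return gain * irradiance * transmittance * reflectance + gain * path_radiance


# Hypothetical single-channel measurement conditions for the same crop.
tds = dict(gain=1.00, irradiance=100.0, transmittance=0.80,
           reflectance=0.10, path_radiance=2.5)
rds = dict(gain=1.10, irradiance=92.0, transmittance=0.75,
           reflectance=0.12, path_radiance=3.1)

s_1 = recorded_signal(**tds)
s_2 = recorded_signal(**rds)


def reflected_term(c):
    """The product G*E*T*rho that appears in equation (8)."""
    return c["gain"] * c["irradiance"] * c["transmittance"] * c["reflectance"]


# Equation (8): ratio of the reflected terms of the two data sets.
a = reflected_term(rds) / reflected_term(tds)
# Equation (9): recorded path radiance of set 2 minus A times that of set 1.
b = rds["gain"] * rds["path_radiance"] - a * tds["gain"] * tds["path_radiance"]

mapped = a * s_1 + b  # equation (7) applied to the data set 1 signal
print(s_2, mapped)
```

Because the reflectance enters equation (8), a different crop with a different reflectance change between the two data sets would in general give a different pair of parameters.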
The two parameters $A_\alpha^{(i)}$ and $B_\alpha^{(i)}$ contain the effects of all
measurement variables including target reflectance. If the distribution of
reflectances for target $\alpha$ is different for the two data sets then the MASC
mapping will, in general, not be unique. The two MASC parameters will have
an explicit dependence on the target type $\alpha$. In what follows we will
assume that the distributions of reflectances for the various targets are
approximately the same for the two data sets. In this way we are able to
employ a unique mapping using the parameters $A^{(i)}$ and $B^{(i)}$. If the above
assumption does not hold then we will define a unique mapping by the
parameters $A^{(i)}$ and $B^{(i)}$ which are the averages over $\alpha$ of the parameters
$A_\alpha^{(i)}$ and $B_\alpha^{(i)}$, where

$$A^{(i)} = \frac{1}{N} \sum_\alpha A_\alpha^{(i)}, \qquad B^{(i)} = \frac{1}{N} \sum_\alpha B_\alpha^{(i)},$$

with N the number of target types $\alpha$. Thus equation (7) becomes

$$S_{\alpha 2}^{(i)} = A^{(i)} S_{\alpha 1}^{(i)} + B^{(i)}. \tag{10}$$
So far everything we have done is formal and of little use unless
the MASC parameters can be found for the data sets of interest. If the
gain and target reflectances were the same for both data sets, $A^{(i)}$ and
$B^{(i)}$ could in theory be obtained by making appropriate atmospheric
measurements at the time of data collection. Even this, however, may not
be practical for timely large area inventories.

What is required is equivalent "looks" at the two data sets. In this
way information concerning the relative natures of the data sets can be
obtained without resort to ground observations. One method of obtaining
this information quantitatively is with the use of unsupervised clustering.

The MASC algorithm which has been developed to obtain $A^{(i)}$ and $B^{(i)}$ uses
an ERIM clustering routine [4]. Any good clustering routine should work
provided it is applied in exactly the same way to both data sets.

The clustering routine is applied separately to both data sets. (It
is not necessary to cluster over every point in the data set; a sampling,
e.g. over every other scan line, would be sufficient.) The output from the
ERIM clustering routine is a set of clusters. The number of pixels in each
cluster is given in the output. The clusters are represented by
multivariate Gaussian distributions. Only those clusters are retained which
contain more than 1% of all the pixels clustered.

These clusters are unidentified for both data sets; no ground truth
has been used. In order to use these clusters to obtain the MASC parameters
of equation (10) it is necessary to find a correspondence between the
individual clusters of each data set. To form this correspondence we order
the clusters of each data set on the basis of their means in one of the
channels. Other, perhaps better, methods of forming this correspondence
are in the process of being programmed. In the present implementation the
channel chosen for this ordering is the channel with the largest range of
values. After both sets of clusters have been ordered in this way a
one-to-one correspondence is made: the number one cluster of data set one is

[4] H. M. Horwitz, J. T. Lewis and A. P. Pentland, "Estimating Proportions
of Objects From Multispectral Scanner Data", ERIM Report No. 109600-13-F,
April 1975, Section 4.4 and Appendix E.

matched up with the number one cluster of data set two, etc. Using the
means of the Gaussian distributions representing the clusters as points
defining a line

$$C_2^{(i)} = A^{(i)} C_1^{(i)} + B^{(i)},$$

where the $C_2^{(i)}$ and $C_1^{(i)}$ are the sets of cluster means in channel i for data
set 2 and data set 1 respectively, a regression routine is used to
determine the parameters $A^{(i)}$ and $B^{(i)}$. These parameters are then applied to
the signatures of the TDS, as in equation (10), and the resulting
transformed signatures can be applied to the RDS.
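
The cluster matching and regression steps described above can be sketched as follows. This is an illustration under stated assumptions, not the ERIM implementation: clusters are represented only by their mean vectors and pixel counts, the ordering channel is chosen from the data set 1 clusters, unmatched clusters are dropped when the two sets retain different numbers of clusters, and an ordinary least-squares line is fitted per channel. All function names and cluster values are hypothetical.

```python
def retain_clusters(clusters, min_fraction=0.01):
    """Keep the means of clusters holding more than min_fraction of all pixels."""
    total = sum(count for _, count in clusters)
    return [mean for mean, count in clusters if count > min_fraction * total]


def ordering_channel(means):
    """Index of the channel whose cluster means span the largest range."""
    n_channels = len(means[0])
    spans = [max(m[c] for m in means) - min(m[c] for m in means)
             for c in range(n_channels)]
    return spans.index(max(spans))


def fit_line(x, y):
    """Ordinary least squares for y = a*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def masc_parameters(tds_clusters, rds_clusters):
    """Estimate per-channel (A, B) mapping data set 1 (TDS) onto data set 2 (RDS)."""
    means_1 = retain_clusters(tds_clusters)
    means_2 = retain_clusters(rds_clusters)
    channel = ordering_channel(means_1)      # assumption: chosen from the TDS clusters
    means_1.sort(key=lambda m: m[channel])
    means_2.sort(key=lambda m: m[channel])
    n_pairs = min(len(means_1), len(means_2))  # assumption: unmatched clusters are dropped
    pairs = list(zip(means_1[:n_pairs], means_2[:n_pairs]))
    return [fit_line([m1[c] for m1, _ in pairs], [m2[c] for _, m2 in pairs])
            for c in range(len(means_1[0]))]


# Hypothetical two-channel clusters as (mean vector, pixel count). The RDS means
# were generated from the TDS means with A = (1.2, 0.9) and B = (3.0, 1.0); each
# set also contains one small cluster that falls below the 1% cutoff.
tds_clusters = [([20.0, 15.0], 400), ([35.0, 60.0], 300),
                ([50.0, 30.0], 295), ([90.0, 5.0], 5)]
rds_clusters = [([45.0, 55.0], 250), ([27.0, 14.5], 350),
                ([63.0, 28.0], 390), ([5.0, 80.0], 10)]

parameters = masc_parameters(tds_clusters, rds_clusters)
print(parameters)
```

The sketch also makes the underlying assumption visible: if the two scenes contained different target types, ordering by one channel would pair unrelated clusters and the fitted line would be meaningless.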

The basic assumption behind this MASC algorithm is that the two data 
sets contain the same types of targets, although not necessarily in the 
same proportions. If the correspondence of unidentified clusters in the 
manner described is to have any validity this assumption must hold at 
least approximately. Forming such correspondences would make little sense 
if one data set was agricultural and the other was woodlands, or urban. 

The test of any signature extension method lies in its performance on
real data. The results of applying MASC to our data sets are listed in
Table 5, which gives the data set from which the signatures were derived
and the data set to which they were extended. The results are for
field-center-pixel recognition.* Also listed are the results of applying
untransformed (UT) signatures.

We see that the MASC algorithm results in significant improvement in
major crop recognition for all three cases. Also there is very little change
in "Correct Other".

Both Type 1 methods we have investigated have been found to be potentially
viable signature extension methods for large area crop surveys.

*The multiplicative terms, $A^{(i)}$, were used to scale the signature
covariances as well as the signature means.
TABLE 5. RECOGNITION USING UNTRANSFORMED AND MASC SIGNATURES

Center-Field-Pixel Recognition

Training        Signature        Recognition     Correct   Correct
Data Set        Transformation   Data Set        Wheat     Other
                Applied
Fay, 11 June    UT               Fay, 10 June    64.0%     89.3%
Fay, 11 June    MASC             Fay, 10 June    93.0%     84.2%
Fay, 11 June    UT               She, 8 June     41.5%     95.9%
Fay, 11 June    MASC             She, 8 June     83.0%     95.0%

Training        Signature        Recognition     Correct   Correct   Correct
Data Set        Transformation   Data Set        Corn      Soy       Other
                Applied
White, 21 Aug   UT               Fay, 21 Aug     1.7%      10.0%     70.9%
White, 21 Aug   MASC             Fay, 21 Aug     83.4%     83.2%     72.2%
6

TYPE 2 SIGNATURE EXTENSION METHODS

As described in section 1, a Type 2 method is one which requires the
preprocessing of both the TDS and RDS. This preprocessing is performed in
an attempt to remove the effects of variations in the relative measurement
conditions of the two data sets. In this section we investigate two Type 2
methods: Ratios of Spectral Bands, and RADIFF. While both methods had
worked well with aircraft data in the past, they failed to perform
satisfactorily, in terms of recognition, on LANDSAT-1 data. The reasons for
these failures are discussed.

6.1 RATIOS OF SPECTRAL BANDS 

Ratios of Spectral Bands (Ratios) [5,6] is a Type 2 signature extension 
technique. It requires the preprocessing of every data point in both the 
training and recognition data sets. 

The preprocessing of the data consists of forming new channels which 
are the ratio of the scanner signals in two of the LANDSAT-1 bands. Because 
there are four MSS LANDSAT-1 bands one can form three independent ratio 
channels. As we will see, the usefulness of the Ratio method depends on 
two assumptions concerning the relative measurement conditions between the 
TDS and RDS: 

1) The variations in target reflectance, between the TDS and RDS, are
systematic in the sense that a variation in one channel is matched
by variations in the other channels.

2) There is no path radiance term in the signals of either data set.

A further restriction on the usefulness of the Ratio method is imposed by
the fact that the separation between the target signatures is reduced when
the Ratio channels are formed. This may not be true for other data sets
where interclass variation is large.

[5] R. K. Vincent, G. S. Thomas and R. F. Nalepka, "Signature Extension
Studies", ERIM Report No. 190100-26-T, July 1974.

[6] R. F. Nalepka and J. P. Morgenstern, "Signature Extension: An Approach
to Operational Multispectral Surveys", ERIM Report No. 31650-152-T, March 1973.

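As seen previously, the signal for target α in channel i is given by

$$S_\alpha^{(i)} = G^{(i)} \left[ \rho_\alpha^{(i)} E^{(i)} T^{(i)} + L_p^{(i)} \right],$$

where $G^{(i)}$ is the system gain in channel i, $E^{(i)}$ is the total downward
irradiance incident on the target, $T^{(i)}$ is the transmittance which
effectively attenuates the reflected radiation from the target, $\rho_\alpha^{(i)}$ is
the reflectance of the target, and $L_p^{(i)}$ is the path radiance which has
been scattered into the field of view from something other than the target.
A ratio channel is formed from the signals in two of the MSS channels:

$$R_\alpha^{(ij)} = \frac{S_\alpha^{(i)}}{S_\alpha^{(j)}} = \frac{\rho_\alpha^{(i)} G^{(i)} E^{(i)} T^{(i)} + L_p^{(i)} G^{(i)}}{\rho_\alpha^{(j)} G^{(j)} E^{(j)} T^{(j)} + L_p^{(j)} G^{(j)}}.$$

Denoting the TDS with the subscript 1 and the RDS with the subscript 2, for
the two data sets we have:

$$R_{\alpha 1}^{(ij)} = \frac{\rho_{\alpha 1}^{(i)} G_1^{(i)} E_1^{(i)} T_1^{(i)} + L_{p1}^{(i)} G_1^{(i)}}{\rho_{\alpha 1}^{(j)} G_1^{(j)} E_1^{(j)} T_1^{(j)} + L_{p1}^{(j)} G_1^{(j)}}$$

and

$$R_{\alpha 2}^{(ij)} = \frac{\rho_{\alpha 2}^{(i)} G_2^{(i)} E_2^{(i)} T_2^{(i)} + L_{p2}^{(i)} G_2^{(i)}}{\rho_{\alpha 2}^{(j)} G_2^{(j)} E_2^{(j)} T_2^{(j)} + L_{p2}^{(j)} G_2^{(j)}}.$$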
We now assume that the variation in the $G^{(i)} E^{(i)} T^{(i)}$ terms is matched by
a similar variation in $G^{(j)} E^{(j)} T^{(j)}$, i.e.,

$$G_2^{(i)} E_2^{(i)} T_2^{(i)} = (1 + \delta)\, G_1^{(i)} E_1^{(i)} T_1^{(i)}$$

and

$$G_2^{(j)} E_2^{(j)} T_2^{(j)} = (1 + \delta)\, G_1^{(j)} E_1^{(j)} T_1^{(j)}.$$

Using this variation we can write

$$R_{\alpha 2}^{(ij)} = \frac{\rho_{\alpha 2}^{(i)} G_1^{(i)} E_1^{(i)} T_1^{(i)} (1 + \delta) + L_{p2}^{(i)} G_2^{(i)}}{\rho_{\alpha 2}^{(j)} G_1^{(j)} E_1^{(j)} T_1^{(j)} (1 + \delta) + L_{p2}^{(j)} G_2^{(j)}}.$$

Now making our two assumptions, namely:

1) $\dfrac{\rho_{\alpha 2}^{(i)}}{\rho_{\alpha 2}^{(j)}} = \dfrac{\rho_{\alpha 1}^{(i)}}{\rho_{\alpha 1}^{(j)}}$

2) $L_{p2}^{(i)} = L_{p2}^{(j)} = L_{p1}^{(i)} = L_{p1}^{(j)} = 0$

we find that

$$R_{\alpha 2}^{(ij)} = \frac{\rho_{\alpha 1}^{(i)} G_1^{(i)} E_1^{(i)} T_1^{(i)}}{\rho_{\alpha 1}^{(j)} G_1^{(j)} E_1^{(j)} T_1^{(j)}} = R_{\alpha 1}^{(ij)}.$$

Thus the Ratio channels, if the above assumptions hold, will yield "universal"
signatures in the sense that the crop signatures will be the same for both
data sets 1 and 2.
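
A small numerical check of this result, using made-up values, may help. The sketch below assumes the signal model stated earlier in this section (reflectance times the gain-irradiance-transmittance product, plus gain times path radiance); the function names and every number are hypothetical.

```python
def signal(rho, get, path_radiance, gain=1.0):
    """Signal in one channel: rho * (G*E*T) + G * L_p."""
    return rho * get + gain * path_radiance

def ratio_channel(rho_i, rho_j, get_i, get_j, lp_i, lp_j):
    """Ratio of the signals in channels i and j for one target."""
    return signal(rho_i, get_i, lp_i) / signal(rho_j, get_j, lp_j)

# Hypothetical target reflectances and G*E*T products for data set 1.
rho_i, rho_j = 0.08, 0.40
get_i, get_j = 50.0, 70.0
delta = 0.25  # common fractional change of G*E*T in data set 2

# Data set 1, no path radiance.
r1 = ratio_channel(rho_i, rho_j, get_i, get_j, 0.0, 0.0)

# Data set 2: both channels scaled by (1 + delta), still no path radiance.
r2 = ratio_channel(rho_i, rho_j,
                   (1 + delta) * get_i, (1 + delta) * get_j, 0.0, 0.0)

# Data set 2 again, but with non-zero path radiance (assumption 2 violated).
r2_haze = ratio_channel(rho_i, rho_j,
                        (1 + delta) * get_i, (1 + delta) * get_j, 1.5, 0.5)
```

Here `r1` and `r2` agree, while `r2_haze` does not, which illustrates why the absence of a path radiance term is essential to the argument.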
If the assumptions of the Ratio method hold approximately then the
method could prove useful for extending signatures. However, due to the
spectral characteristics of the LANDSAT-1 scanner, the Ratio method has
not proven very successful with present satellite data. The LANDSAT-1
bands are very broad and widely separated; therefore the condition that any
change in $G^{(i)} E^{(i)} T^{(i)}$ is matched by a similar variation in $G^{(j)} E^{(j)} T^{(j)}$
is probably not well satisfied. Also, the shapes of the spectral curves
for the various vegetative types are very similar in the LANDSAT channels.
Most of the discriminatory information is contained in the relative
magnitudes of the signals. When Ratio channels are formed a good deal of the
magnitude information is lost, while differences in spectral shape are
emphasized. To see this quantitatively we look at the separation between
signals for the various vegetative crops. For the four LANDSAT channels
typical values for the separation are

$$\frac{S_\alpha^{(i)} - S_\beta^{(i)}}{\tfrac{1}{2}\left(S_\alpha^{(i)} + S_\beta^{(i)}\right)} \approx 10\text{-}20\%,$$

where $S_\alpha^{(i)}$ and $S_\beta^{(i)}$ are the signature means of wheat and grass for
F6-11. For the Ratio channels, however, the separation is much smaller:

$$\Delta_{\alpha\beta}^{(ij)} = \frac{R_\alpha^{(ij)} - R_\beta^{(ij)}}{\tfrac{1}{2}\left(R_\alpha^{(ij)} + R_\beta^{(ij)}\right)} \approx 1\text{-}3\%.$$

Thus the ability to discriminate between crops α and β is reduced when Ratio
channels are used. It should be noted that the Ratio technique has been found
to be quite effective when other scanners, e.g. aircraft, are used.
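
The loss of separation can be illustrated with two invented signatures that have nearly the same spectral shape but different magnitudes. The values below are hypothetical and are not the F6-11 wheat and grass means; the absolute value in `separation` is an added convenience so that the order of the two classes does not matter.

```python
def separation(a, b):
    """Normalized separation: |a - b| divided by the mean of a and b."""
    return abs(a - b) / (0.5 * (a + b))

# Hypothetical signature means in two channels for two vegetation classes.
wheat = (30.0, 45.0)
grass = (35.0, 51.0)

# Separation in each original channel.
chan_sep = [separation(w, g) for w, g in zip(wheat, grass)]

# Separation in the single ratio channel formed from the two channels.
ratio_sep = separation(wheat[0] / wheat[1], grass[0] / grass[1])
```

For these numbers the per-channel separations fall between 10% and 20%, while the separation in the ratio channel is below 3%, because the ratio cancels most of the magnitude difference between the two classes.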

If one is to attempt to employ ratios for identifying different vegetative
types then the channels to be ratioed must be chosen so that $\Delta_{\alpha\beta}^{(ij)}$ is
maximized. A more rigorous method of choosing the channels is to compute the
pairwise probability of misclassification, PPM, for all possible ratios and
then choose the best set of three ratios. Using this method for the F6-10
data set one obtains the ratio set 2/4, 3/1, 3/4. Using the first criterion
the best set obtained is 2/4, 3/1, 3/2. A comparison of the results using
these two ratio sets on the test fields of F6-10 using the F6-10 signatures
is shown in Table 6.

TABLE 6. COMPARISON OF PPM AND MAX $\sum_{\alpha,\beta} w_{\alpha\beta} \Delta_{\alpha\beta}$ CRITERION

Ratio Set          % Recognition of Wheat   % Correct Other
(2/4, 3/1, 3/4)    52.6%                    93.0%
(2/4, 3/1, 3/2)    55.3%                    95.7%

Thus the use of the criterion that a weighted sum $\sum_{\alpha,\beta} w_{\alpha\beta} \Delta_{\alpha\beta}^{(ij)}$ be
maximized ($w_{\alpha\beta}$ represents a weighting of the vegetative types to be
distinguished), while not rigorously justified, seems to be a useful method
for choosing ratio channels.
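
One way such a weighted-sum selection could be organized is sketched below. It is an assumption-laden illustration, not the procedure used in the report: the signature means and weights are invented, each channel pair is scored in one orientation only (channel i over channel j with i < j), and the independence of the three ratios is checked through the rank of their log-ratio vectors.

```python
from itertools import combinations
import numpy as np

def separation(a, b):
    """Normalized separation between two values."""
    return abs(a - b) / (0.5 * (a + b))

def score_ratio(means, weights, i, j):
    """Weighted sum of separations in the ratio channel i/j over class pairs."""
    total = 0.0
    for (c1, c2), w in weights.items():
        r1 = means[c1][i] / means[c1][j]
        r2 = means[c2][i] / means[c2][j]
        total += w * separation(r1, r2)
    return total

def best_ratio_set(means, weights, n_channels=4, k=3):
    """Return the highest-scoring set of k independent ratio channels."""
    pairs = list(combinations(range(n_channels), 2))
    best, best_score = None, -1.0
    for combo in combinations(pairs, k):
        # Each ratio i/j corresponds to log(S_i) - log(S_j); require rank k.
        rows = np.zeros((k, n_channels))
        for r, (i, j) in enumerate(combo):
            rows[r, i], rows[r, j] = 1.0, -1.0
        if np.linalg.matrix_rank(rows) < k:
            continue
        s = sum(score_ratio(means, weights, i, j) for i, j in combo)
        if s > best_score:
            best, best_score = combo, s
    return best, best_score

# Hypothetical signature means (channels indexed 0..3) and pair weights.
means = {
    "wheat": np.array([30.0, 25.0, 45.0, 40.0]),
    "grass": np.array([35.0, 31.0, 51.0, 44.0]),
}
weights = {("wheat", "grass"): 1.0}
best, best_score = best_ratio_set(means, weights)
```

The exhaustive search is cheap here because four channels give only six channel pairs and twenty candidate triples.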

Using this ratio set to extend the F6-11 signatures to F6-10 (see
Table 7) does not improve recognition results. It should be noted, however,
that an optimum set of Ratios for one data set may be sub-optimum for a
different data set.

TABLE 7. RECOGNITION OF F6-10 TRAINING AND TEST FIELDS

Signatures               % Recognition of Wheat   % Correct Other
F6-11 UT                 64.0%                    89.3%
F6-11 (2/4, 3/1, 3/2)    64.0%                    88.2%
6.2 RADIFF*

RADIFF (ratio of differences) [7] is a Type 2 signature extension
method. It provides a means of preprocessing the data such that, if certain
assumptions hold, the three types of variation listed in the MASC section
are eliminated from the data.

From the results of calculating the path radiance using the ERIM
Radiative Transfer Model it was found that the ratio of path radiance in
adjacent channels was approximately constant. In order to take advantage
of this fact the RADIFF method has been developed.

We shall begin by deriving the equations that define the RADIFF
transformation and then will point out the assumptions which are implicit in
those equations. Starting with equation (1) for the signal recorded in
channel i for a particular crop type, α, in data set 1:

$$S_{\alpha 1}^{(i)} = G_1^{(i)} E_1^{(i)} T_1^{(i)} \rho_{\alpha 1}^{(i)} + G_1^{(i)} L_{p1}^{(i)}, \qquad (11)$$

where again $G_1^{(i)}$, $E_1^{(i)}$, $T_1^{(i)}$, and $L_{p1}^{(i)}$ are, respectively, the gain, total
downward irradiance, transmittance and path radiance for data set 1 in
channel i and $\rho_{\alpha 1}^{(i)}$ is the reflectance of crop α in channel i for data set 1.
We now form a new channel:

$$S_{\alpha 1}^{(i,i+1,i+2)} = \frac{S_{\alpha 1}^{(i)} - S_{\alpha 1}^{(i+1)} K^1_{i,i+1}}{S_{\alpha 1}^{(i+2)} - S_{\alpha 1}^{(i+1)} K^1_{i+2,i+1}}, \qquad (12)$$

where

$$K^1_{i,i+1} = \frac{L_{p1}^{(i)}}{L_{p1}^{(i+1)}}.$$

* The general concept for RADIFF was developed under the name DIFF/DIFF.
See reference [7].

[7] R. F. Nalepka and J. P. Morgenstern, "Signature Extension: An Approach
to Operational Multispectral Surveys", ERIM Report No. 31650-152-T, March 1973,
p. 36.
Using equation (11) in equation (12) we find that

$$S_{\alpha 1}^{(i,i+1,i+2)} = \frac{C^1_{i,i+1} \dfrac{\rho_{\alpha 1}^{(i)}}{\rho_{\alpha 1}^{(i+1)}} - K^1_{i,i+1}}{C^1_{i+2,i+1} \dfrac{\rho_{\alpha 1}^{(i+2)}}{\rho_{\alpha 1}^{(i+1)}} - K^1_{i+2,i+1}}, \qquad (13)$$

where

$$C^1_{i,i+1} = \frac{G_1^{(i)} E_1^{(i)} T_1^{(i)}}{G_1^{(i+1)} E_1^{(i+1)} T_1^{(i+1)}}.$$

In deriving equation (13) we have assumed that the gain factor is the
same for each channel, i.e.,

$$G^{(i)} = G^{(i+1)} = G^{(i+2)}, \text{ etc.}$$

We now make the additional assumptions that:

1) Any variation in the path radiance in one channel is matched
in the adjacent channel. Thus we assume that the ratio

$$K^1_{i,i+1} = \frac{L_{p1}^{(i)}}{L_{p1}^{(i+1)}}$$

is independent of the particular atmospheric state, i.e.,
the ratio of path radiances can be written as $K_{i,i+1}$.

2) In the same way any variation in the product $G^{(i)} E^{(i)} T^{(i)}$ is
matched in the adjacent channel; thus the factor $C^1_{i,i+1}$
becomes $C_{i,i+1}$ and is independent of atmospheric state.
If these assumptions hold then the new RADIFF channel will be independent
of atmospheric state. If we further assume that we are interested in
extending signatures to data sets for which the crop reflectances do not
vary, then the RADIFF channel should have the same value for each crop
in all data sets:

$$S_{\alpha 1}^{(i,i+1,i+2)} = S_{\alpha 2}^{(i,i+1,i+2)} = \cdots = S_{\alpha n}^{(i,i+1,i+2)} = S_\alpha^{(i,i+1,i+2)}.$$
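
The cancellation behind this statement can be checked numerically. The sketch below follows equations (11) and (12) with equal gain in all channels; the reflectances, irradiance-transmittance products, path radiances, and scale factors are all invented for illustration.

```python
def radiff(s_i, s_i1, s_i2, k_i_i1, k_i2_i1):
    """RADIFF channel of equation (12) from three adjacent-channel signals."""
    return (s_i - k_i_i1 * s_i1) / (s_i2 - k_i2_i1 * s_i1)

def signals(rho, et, lp, gain=1.0):
    """Signals in three adjacent channels: G * (E*T*rho + L_p)."""
    return [gain * (e * r + l) for r, e, l in zip(rho, et, lp)]

# Hypothetical crop reflectances and atmosphere for data set 1.
rho = [0.10, 0.05, 0.30]
et = [40.0, 45.0, 50.0]
lp = [1.5, 1.0, 0.5]

# Path radiance ratios relative to the middle channel.
k_12 = lp[0] / lp[1]
k_32 = lp[2] / lp[1]

# Data set 2: E*T scaled by 1.2 and path radiance by 1.5 in every channel,
# so the adjacent-channel ratios K and C are unchanged.
s_a = signals(rho, et, lp)
s_b = signals(rho, [1.2 * e for e in et], [1.5 * l for l in lp])

ch_a = radiff(*s_a, k_12, k_32)
ch_b = radiff(*s_b, k_12, k_32)

# Using a wrong K (here 20% too large) breaks the invariance.
ch_b_wrong = radiff(*s_b, 1.2 * k_12, k_32)
```

`ch_a` and `ch_b` agree to rounding error, whereas `ch_b_wrong` differs, which suggests how sensitive the transformation may be to the values adopted for the K's.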


If all of the above assumptions hold then the RADIFF transformation
will yield universal crop signatures. The degree to which the universality
of the RADIFF signatures fails is a reflection of the limited degree to
which the assumptions are satisfied. In order to form the RADIFF channels
(equation (12)) we must calculate the values for the $K_{i,i+1}$; at the
same time we can test the assumption that they are independent of atmospheric
state. The independence of the $C_{i,i+1}$ could be examined in the same way
but this has not as yet been done. For purposes of using the RADIFF channels
we wish in particular to form $S^{(1,2,3)}$ and $S^{(2,3,4)}$. We thus must calculate
$K_{1,2}$, $K_{3,4}$, and $K_{2,3}$. The ERIM Radiative Transfer model was used to
calculate the path radiance for a number of atmospheric states, as described by
visibility values (see Fig. 5). These values for $L_p$ were then integrated
over simulated LANDSAT bands as shown in Fig. 6. The results of the
integration are given in Table 8. Finally the ratios $K_{1,2}$, $K_{2,3}$, and $K_{3,4}$ were
formed and are plotted versus visibility in Fig. 7. Also shown in Fig. 7
are the averages of the ratios over visibility and the maximum variation
from this average. As can be seen, the assumption that the K's are constant
over atmospheric state is correct within approximately 10%.
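
The constancy test described above amounts to forming each ratio at every visibility and comparing it with its average. A minimal sketch follows; the band-integrated path radiance values are placeholders invented for this example and are not the entries of Table 8.

```python
import numpy as np

# HYPOTHETICAL band-integrated path radiance: one row per visibility,
# one column per LANDSAT-1 band (bands 1 to 4).
visibility_km = [5, 10, 15, 20, 25]
lp = np.array([
    [1.30, 0.85, 0.52, 0.30],
    [1.00, 0.62, 0.37, 0.21],
    [0.88, 0.53, 0.31, 0.18],
    [0.81, 0.48, 0.28, 0.16],
    [0.77, 0.45, 0.26, 0.15],
])

def ratio_stats(lp, i, j):
    """Ratio of band i to band j per visibility, its mean, and the largest
    fractional deviation from that mean."""
    k = lp[:, i] / lp[:, j]
    mean = k.mean()
    max_dev = np.max(np.abs(k - mean)) / mean
    return k, mean, max_dev

# Band indices are zero-based: (0, 1) is K_{1,2}, and so on.
stats = {name: ratio_stats(lp, i, j)
         for name, (i, j) in {"K12": (0, 1), "K23": (1, 2), "K34": (2, 3)}.items()}
```

The `mean` values play the role of the average K's used to form the RADIFF channels, and `max_dev` is the quantity to compare against the roughly 10% figure quoted in the text.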
FIGURE 5. PATH RADIANCE (m watts/cm²/steradian/µm) VS. WAVELENGTH.
Alt. = 910 km. Solar Zenith = 62°, Green Vegetation
Target on Green Vegetation Background. (Calculation
based on ERIM Radiative Transfer Model.)
Visibility = 5 km, 10 km, 15 km, 20 km, 25 km.

FIGURE 6. RELATIVE RESPONSE OF LANDSAT-1 VS. WAVELENGTH.
Legend: Actual LANDSAT-1; Simulation for Integration (dashed line).

TABLE 8. PATH RADIANCE INTEGRATED OVER LANDSAT-1 BANDS



The two RADIFF channels were formed for both the F6-11
and F6-10 data sets. (Note: the latter channel was formed in that particular
way to ensure that it was positive and greater than 1.0.) The values used
for the K's were the average values as shown in Fig. 7. Training was then
accomplished using the training fields of F6-11 and these signatures were
then used to perform recognition on the F6-10 data set. The results are
listed in Table 9.

TABLE 9. RECOGNITION OF F6-10 USING F6-11 RADIFF SIGNATURES

Center Field Pixel Recognition

Correct Wheat   Correct Other
64.0%           85.4%
These results are not as good as the MASC or ASC results. At this
point it is not possible to say whether the poor results are due to the
basic assumptions not being satisfied exactly. It may be necessary
to recalculate the K's with better approximations to the response functions
of the LANDSAT-1 scanner. It may also be necessary to restrict the limits
of applicability of the model so that the assumptions are more nearly
satisfied. It is also possible that in forming the RADIFF channels, i.e.,
preprocessing the data, the information content of the signals is reduced.
This in turn could either be due to round-off errors in the calculation of
the RADIFF channels or be inherent in the nature of the transform.
Further investigations are required to answer these questions.
7

EXPERIMENT TO DETERMINE EFFECT OF ATMOSPHERIC-GEOMETRIC
VARIATIONS ON RECOGNITION

As discussed in a previous section, there are a number of variables
which could lead to changes in the target signatures when going from one
data set to another. One of the possible sources of variation results
from a change in sensor gain. Since this is strictly an instrumental
variation, and our experiment will involve two data sets taken only one
day apart, we will assume that the gain is a constant. It is, of course,
difficult to determine if, in fact, this is the case, but, in light of the
calibration methods [8], it seems to be a good assumption.

For the purpose of this experiment the other sources of variation can
be considered to fall into two classes. The first class is essentially
composed of on-the-ground variations. These include changes in soil type and
moisture content, cultural practices (irrigation and fertilization), and
changes in crop maturity. The second class consists of variations in sun
angle (time of day), atmospheric profile (optical depth, aerosol content,
etc.), and scanner view angle. The question which this experiment attempts
to answer is: what is the effect on recognition of variations in only
atmospheric profile and scanner view angle? This question is of importance
because the current LACIE approach may not adequately correct for these
variables. These two variables will result in both additive and
multiplicative changes while the MLA (mean level adjustment) yields only an
additive correction.
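
The limitation of an additive-only correction can be seen in a one-channel toy example. The sketch assumes, purely for illustration, that a mean level adjustment shifts the signatures by the difference between the two data-set means; the report does not spell out the MLA procedure here, and all numbers are invented.

```python
import numpy as np

def mean_level_adjust(signature, data_1, data_2):
    """Additive-only correction (assumed form): shift the signature by the
    difference between the overall means of the two data sets."""
    return signature + (data_2.mean() - data_1.mean())

# Hypothetical one-channel signals in data set 1.
data_1 = np.array([20.0, 30.0, 40.0, 50.0])
sig_1 = 50.0  # signature mean of one class in data set 1

def residual(a, b):
    """Error left after the additive correction when data set 2 is a*x + b."""
    data_2 = a * data_1 + b
    true_sig_2 = a * sig_1 + b
    return true_sig_2 - mean_level_adjust(sig_1, data_1, data_2)

res_additive_only = residual(1.0, 3.0)   # purely additive change
res_with_gain = residual(1.2, 3.0)       # additive and multiplicative change
```

A purely additive change is removed exactly, but once a multiplicative factor is present a residual remains that grows with the distance of the class signature from the data-set mean.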

In order to answer the question we have posed it is necessary to find
two data sets for which, to the best of our knowledge, the only variables
are atmosphere and scanner view angle. Fortunately, in the CITARS study two
such data sets were available. These were the Fayette June 10 and June 11
data sets (F6-10 and F6-11). In particular we will consider only the
training fields which were identified in the CITARS study. Thus we have
two data sets composed of exactly the same fields; the only difference
between the two sets is that they were collected on successive days.
Since we are looking at the same fields only one day apart
(there was no rainfall between the collection of the data sets), factors
such as soil type and moisture content can be assumed constant. Further,
since both data sets were collected at the same time of day the sun angle
is not a variable.

[8] ERTS Data Users Handbook, NASA Document No. 71SD4249, Appendix G.

To see that the state of maturity of the wheat was constant we plot
mean signature values obtained from both F6-10 and F6-11 in Fig. 8. The
upper line for each target is the mean signature value from F6-10 and the
lower line is the mean signature value from F6-11. As can be seen,
there is no substantial change in the wheat signature going from F6-10 to
F6-11 which is not reflected in the change in the signature for trees.
The weed signature shows a similar variation. For the June period, a change
in maturity for wheat would primarily be due to "browning", i.e., loss of
chlorophyll. This would in turn result in an increase in reflectance in
channel 2. In fact, however, the signature for wheat in channel 2 for
F6-11 is lower than for F6-10. We can therefore assume from this analysis
that the state of maturity of wheat is not a variable when going from F6-11
to F6-10.

Thus the primary variables are atmosphere and scanner view angle.
While on-the-ground horizontal visibility readings were the same for both
June 10 and June 11 at nearby airports, there was one obvious difference
in the atmospheres for June 10 and June 11 in the imagery of the respective
LANDSAT frames. While the June 10 frame was clear of all clouds there were
some small cumulus clouds in the June 11 frame. Neither these clouds nor
their shadows covered any of the training fields.
Because the data were collected 24 hours apart there was a small
difference in scanner view angle (see Fig. 9). For the data taken on
the 10th the view angle was approximately 2°50' west of nadir while on
the 11th it was approximately 3°40' east of nadir. Thus there was more
than a 6° difference in view angle.
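
Since the two view angles lie on opposite sides of nadir, their magnitudes add. As a short worked check of the figure quoted above:

```latex
\[
  2^\circ 50' + 3^\circ 40' = 5^\circ 90' = 6^\circ 30'
\]
```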

In order to test the effect of variations in atmosphere and scanner
view angle on recognition we derive our training statistics from all of
the training fields from F6-11. These signatures are then used for
recognition on the very same fields for both F6-11 and F6-10. Obviously, if the
variations in atmosphere and view angle do not affect recognition accuracy
then the results should be approximately the same for both F6-11 and F6-10.
As can be seen in Table 10 there is a clear reduction in recognition
accuracy when signatures from F6-11 are applied to F6-10.

TABLE 10. RECOGNITION RESULTS OF FAY 6-11 AND FAY 6-10
TRAINING FIELDS USING F6-11 SIGNATURES

Central Field Recognition

Recognition Data Set   Correct Wheat   Correct Other
Fay 6-11, Training     91.6%           97.2%
Fay 6-10, Training     72.9%           97.7%


We see, therefore, that variations in atmosphere and scanner view
angle alone can seriously affect recognition. As discussed in the previous
section the MASC algorithm has the capability of correcting for these
variations, as well as other possible variations. In Table 11 we give the results
of applying the MASC algorithm to the F6-11 signatures and then using them
to perform recognition on the F6-10 training fields.
FIGURE 9. RELATIVE GEOMETRY OF DATA COLLECTION FOR FAYETTE,
JUNE 10 (F6-10) AND FAYETTE, JUNE 11 (F6-11).

TABLE 11. RECOGNITION OF F6-10 TRAINING FIELDS USING
MASC TRANSFORMED SIGNATURES FROM F6-11

Central Field Recognition

Data Set         Correct Wheat   Correct Other
F6-10 Training   100%            94.3%


Because of the nature of the data collection it is difficult to separate
the effects of atmospheric state and scanner view angle. This experiment
clearly demonstrates, however, that one or both of these effects can have a
real impact when recognition with extended signatures is attempted. In the
future we hope to be able to separate these effects by using atmospheric
and canopy models in conjunction with real data and the MASC algorithm.

It should be noted that the results of the CITARS study have demonstrated
that there is a direct correlation between the degradation of non-local
recognition (using untransformed signatures) and the difference in optical
depth between the TDS and RDS [9].

[9] Personal communication from R. M. Bizzell.
DETERMINING TRAINING FIELDS WITHOUT IN SITU GROUND INFORMATION

Signature extension is one approach to reducing the large amounts of
ground information required for operational crop surveys. Another approach
which may prove fruitful is to attempt to determine training sites, for
each data set, without the use of in situ ground information. In this
section we will describe some initial attempts to attack the problem in
this way.

Our approach is based on the assumption that regions of
multidimensional MSS data space can be defined such that each region contains
the MSS data for a single spectral vegetation class. In addition it is
assumed that each region is uniquely defined for all data sets in terms
of its relationship to every other region of the space. If these
assumptions are to hold then it is necessary that the same crops exist in both
data sets. The maturity of the various crops should be approximately the
same for both data sets.

Rather than examine the entire LANDSAT-1 four-dimensional data space
we will deal only with the sub-space of channels 2 and 3. This reduction
of the space causes only a slight loss of information since there is a
high degree of correlation between channels 1 and 2 and between channels 3
and 4 (see Figures 10 and 11). In order to visualize the pattern formed
by the data in our subspace we cluster over the data set and plot
two-dimensional (channels 2 and 3) representations of the clusters. Each
cluster is represented by a one standard deviation ellipse. The cluster
mean is located within each ellipse by a point. The clusters are labeled,
for identification purposes, by the first two digits (see Figure 12). The
second two digits represent the percentage of all the points clustered over
which are included in the cluster. Percentages less than 1% are represented
by double zeros while all other percentages are rounded off to the nearest
integer percentage.
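
For readers who wish to reproduce such plots, the sketch below shows one way to decode a four-digit cluster label and to obtain the axes of a one standard deviation ellipse from a two-channel covariance matrix. The function names, the example label, and the covariance values are illustrative assumptions; the report does not describe the plotting software.

```python
import numpy as np

def parse_label(label):
    """Split a four-digit plot label into (cluster number, percent of points).

    A percentage of 0 stands for "less than 1%".
    """
    return int(label[:2]), int(label[2:])

def one_sigma_ellipse(cov):
    """Semi-axes and orientation of the one standard deviation ellipse.

    cov is a 2x2 covariance matrix (channel 2 and channel 3 in the text).
    Returns (semi_major, semi_minor, angle_in_degrees), where the angle is
    that of the major axis measured from the first coordinate axis.
    """
    vals, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    semi_minor, semi_major = np.sqrt(vals)
    major_dir = vecs[:, 1]             # eigenvector of the larger eigenvalue
    angle = np.degrees(np.arctan2(major_dir[1], major_dir[0]))
    return semi_major, semi_minor, angle

cluster_id, percent = parse_label("3907")          # hypothetical label
axes = one_sigma_ellipse(np.array([[4.0, 0.0],
                                   [0.0, 1.0]]))   # hypothetical covariance
```

Because an eigenvector is defined only up to sign, the returned angle may differ by 180° from one run or library version to another; the ellipse it describes is the same.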

For purposes of displaying the general pattern, clustering was
performed over the quarter sections which contained the training fields
identified by the CITARS project. This subset of the entire data set
was chosen to save computation time. A better method may be to sample
over the entire data set.

The resulting cluster plots for F6-11 and S6-8 are shown in Figures
12 and 13. Referring to Figure 12 we see that the general pattern is
triangular. The vertices of the triangle are clusters 39, 29 and 30.
This form turns out to be quite general for agricultural data sets. The
side of the triangle extending from 39 to 29 represents a progression of bare
soil types from darker to lighter soils. The sides from 29 to 30 and 39
to 30 represent variations in such scene parameters as percent vegetation
cover, plant geometry, leaf structure, etc. coupled with the soil effects.
For a more detailed interpretation of the general cluster pattern, see
Appendix III. If we could identify a region of this triangular pattern
as belonging to a particular crop type then by mapping the triangle from
F6-11 into the triangular pattern for S6-8 we would obtain a mapping of
the single crop region from F6-11 to S6-8. The clusters which fall within
this crop region of S6-8 could then be used to identify fields for training
on that crop. The clusters within the crop region of S6-8 may also be used
to perform recognition on S6-8.

Two methods were used to map the triangular pattern from F6-11 into
the triangular pattern of S6-8. These were the Overlay Method and the
Method of Affine Transformations (MAT). In addition to these two methods
the MASC algorithm may prove useful in the future. The MASC algorithm is
a restricted type of affine transformation.
FIGURE 13. GENERAL CLUSTER PATTERN FOR S6-8 IN CHANNELS 3 AND 2.

The dot represents the location of the S6-8 tree 
cluster. X and ^ are the positions of the tree 
cluster transferred from F6-11 by the Overlay and 
MAT methods, respectively. 
The Overlay Method consists of physically overlaying the cluster
plot of F6-11 on top of the cluster plot of S6-8. The F6-11 plot is
then adjusted, by translations and rotations, until a "best fit" of the
two triangular patterns is obtained. This of course involves the
judgement of the analyst.

The Method of Affine Transformations consists of choosing the means
of the three vertex clusters of the two triangular patterns to define a
general affine transformation. Thus the cluster means of clusters 39,
29 and 30 define three points in the F6-11 space while clusters 13, 27
and 25 are used to define the equivalent points in the S6-8 space. These
two sets of points are then used to derive a transformation matrix which
allows a mapping from F6-11 to S6-8 as described below.

The affine transformation can be written as

$$[M] = A[N], \qquad (14)$$

where A is the matrix which transforms the space N into the space M. For
our purposes the space N corresponds to F6-11 while M corresponds to S6-8.
Since we are working with two dimensions we will define our spaces by two
vectors $m_1$, $m_2$ and $n_1$, $n_2$. We define these vectors as:

$$m_1 = \text{Cluster 29} - \text{Cluster 39}, \qquad m_2 = \text{Cluster 30} - \text{Cluster 39} \qquad (15)$$

and

$$n_1 = \text{Cluster 27} - \text{Cluster 13}, \qquad n_2 = \text{Cluster 25} - \text{Cluster 13}. \qquad (16)$$
What we have done in equations (15) and (16) is translate the origin
in the F6-11 plot to cluster 39 and the origin in S6-8 to cluster 13.
In this way we are building into the transformation matrix, A, a
translation. We have reduced the spaces M and N to 2 × 2 matrices so that
equation (14) becomes

$$M = A N,$$

which can easily be solved for A, formally:

$$A = M N^{-1}.$$

The transformation matrix for going from F6-11 to S6-8, derived in
the above manner, is

$$A = \begin{pmatrix} .801 & -.178 \\ -.208 & .880 \end{pmatrix}.$$

The diagonal elements of A correspond to the multiplicative constants of
the MASC algorithm. In fact the MASC multiplicative constants can be
written in matrix form (for the transformation from F6-11 → S6-8) as

$$\begin{pmatrix} .902 & 0 \\ 0 & .652 \end{pmatrix}.$$

The fundamental difference between a linear transformation of the MASC type
and a general affine transformation is the exclusion of rotations of the axes.
This rotation is represented by the non-zero off-diagonal matrix elements.
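
The construction can be sketched in a few lines. In this illustration the source pattern plays the role of F6-11 and the destination pattern that of S6-8; the vertex coordinates, the function names, and the test matrix are all hypothetical and do not reproduce the cluster means used in the report.

```python
import numpy as np

def fit_affine(src_vertices, dst_vertices):
    """Derive the 2x2 matrix mapping one triangle onto another.

    Each argument is a (3, 2) array of vertex cluster means; row 0 is the
    vertex used as the origin of its pattern.
    """
    src = np.asarray(src_vertices, dtype=float)
    dst = np.asarray(dst_vertices, dtype=float)
    n = (src[1:] - src[0]).T   # columns: edge vectors of the source triangle
    m = (dst[1:] - dst[0]).T   # columns: edge vectors of the target triangle
    a = m @ np.linalg.inv(n)   # solve M = A N for A
    return a, src[0], dst[0]

def map_point(x, a, src_origin, dst_origin):
    """Map a point of the source pattern into the destination pattern."""
    return a @ (np.asarray(x, dtype=float) - src_origin) + dst_origin

# Hypothetical vertex means (channel 2, channel 3) for the source pattern.
src = np.array([[20.0, 15.0], [45.0, 30.0], [25.0, 50.0]])

# Build a destination pattern from a known matrix so the fit can be checked.
a_true = np.array([[0.9, -0.1], [0.2, 0.8]])
dst_origin = np.array([18.0, 14.0])
dst = (a_true @ (src - src[0]).T).T + dst_origin

a, o_src, o_dst = fit_affine(src, dst)
mapped = map_point(src[1], a, o_src, o_dst)
```

Three non-collinear vertex pairs determine the matrix exactly, so any error in locating a vertex cluster passes directly into the off-diagonal elements.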

In the case of MSS data, where the axes are spectral channels, the non-zero
off-diagonal elements can be interpreted as resulting from some dissimilarity
between the two data sets. This dissimilarity may be due to some
different crop types or to different reflectances for some of the crop
types. Another possibility is that the use of only three points to
define the transformation is not precise enough, so that the off-diagonal
elements are "accidentally" non-zero.

In order to test the effectiveness of the two methods they will be
used to locate the position of trees in the S6-8 pattern space. This
object class was chosen because there were relatively few clusters
representing it and because trees were known to be rather distinct, spectrally,
from most other object classes. This pattern may be observed by comparing
Figure 14 with Figure 12 and Figure 15 with Figure 13. Three clusters
were obtained for trees for F6-11 and one cluster for S6-8. Of the F6-11
clusters, cluster 1 contains the majority of the pixels. The two methods
will be tested by how closely they are able to map the tree cluster 1 from
F6-11 onto the tree cluster of S6-8. This mapping is shown in Figure 13,
where the "X" locates the mapping as obtained using the Overlay Method.
The mapping using the MAT is marked by a separate symbol. The actual position of
the S6-8 tree cluster is located by the dot. As can be seen from Figure 13,
the Overlay Method came closer to mapping the F6-11 tree cluster onto the
S6-8 tree cluster. It should also be noted, however, that the tree cluster
of S6-8 does not fall within the pattern formed by clustering over the
quarter sections. This seems to support the idea of increasing the size
of the data sample operated on by the clustering algorithm. A larger
sample size would increase the probability of including the wide range of
soils and soil covers probable in any data set.

The actual mapping of crop regions and the use of those crop regions
to define clusters which can be used to identify training fields has not,
as yet, been attempted. In the future, the further development of these
methods may prove valuable for the location of training fields without in
situ ground information.
CONCLUSIONS AND RECOMMENDATIONS

We have shown that the use of untransformed signatures from a TDS,
when applied to a temporally-spatially removed RDS, yields poor recognition
results. We have investigated the sources of data variability which are
responsible for the degraded recognition results. From this investigation
four signature extension methods have suggested themselves.

In Fig. 16 we display the F6-10 recognition results using
untransformed (UT) F6-11 signatures as well as F6-11 signatures modified by our
four methods: Ratio, RADIFF, ASC and MASC. In Fig. 17 and Fig. 18 we
display the recognition results using UT, MASC and ASC signatures from
F6-11 on S6-8 and from W8-21 on F8-21. We see that the ASC and MASC methods
are quite successful, with the MASC method showing significant improvement in
recognition in all three cases. In addition, if we plot the average probabilities
of misclassification, Fig. 19, we see that the MASC algorithm is fairly constant
in its performance. The UT signatures result in more variation in performance.
Since the variations between respective TDS and RDS are random, the relative
constancy of the average probability of misclassification implies that the
MASC algorithm is indeed capable of correcting for those variations.

The MASC algorithm, in particular, may prove helpful in further
isolating the physical factors which are the cause of the variations in
data between the TDS and RDS. For instance, the ERIM Radiative Transfer
model can be used to calculate the multiplicative and additive constants
based on equations (8) and (9) with the added assumption that the
atmospheric state is the only variable. In Figures 20 and 21 we have
plotted the multiplicative and additive constants based on such a
calculation and as derived using the MASC algorithm. While the values cannot
be expected to match exactly, the curves should have similar shapes if we
have not neglected an important source of variation. As seen in Figures 20
and 21 the shapes of the MASC curves and the model curves are quite similar.
The one exception is between channels one and two for the multiplicative
constant. The additive constant depends both on a correct calculation of the
path radiance and of the multiplicative constant. The large differences in
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FIGURE 16. RESULTS OF RECOGNITION EXPERIMENT ON F6-10 USING VARIOUS TRANSFORMATIONS 
ON THE F6-11 SIGNATURES. The striped bar, is the percentage correct wheat 
recognition. The open bar is percentage other correct. 




FIGURE 17. RESULTS OF RECOGNITION EXPERIMENT ON S6-8 USING ASC AND MASC 
TRANSFORMED F6-11 SIGNATURES. The striped bar is the per- 
centage correct wheat while the open bar is the percentage 
other correct. 
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FIGURE 18. RESULTS OF RECOGNITION EXPERIMENT ON F8-21 USING ASC AND MASC 
TRANSFORMED W8-21 SIGNATURES. The solid bar is percentage 
correct corn recognition while the striped bar is percentage 
soy correct. The open bar is percentage other correct. 
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FIGURE 20. MULTIPLICATIVE SIGNATURE CORRECTION FIGURE 21. ADDITIVE SIGNATURE CORRECTION 

CALCULATED FROM ERIM RADIATIVE FOR F6-11 -+ F6-10 

TRANSFER MODEL AND FROM MASC FOR 
THE TRANSFORMATION F6-11 * F6-10. 

Ratios are assumed visibility (km) 
for F6-11 to visibility for F6-10. 
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magnitude apparent in Fig. 21 may be due to not having the correct values 
of calculated by the model. This anomaly could very possibly be due 

to the fact that visibility was used as an input to the model rather than 
the more exact optical thickness. Also the model calculations are not 
exact since the parameter values at the middle of the LANDSAT-1 bands 
were used rather than integrating over the bands. More investigation into 
such questions as this may prove fruitful in the future. 

It should be noted that none of the signature extension algorithms 
presented here should be considered to be in their final form. For 
instance, other methods of forming correspondences between clusters in 
the MASC algorithm are possible. When these methods are examined it may 
be possible to devise an improved version of MASC. 

We have shown that similarities exist between cluster patterns based
on spatially separated data sites. Two methods were described which
allowed the cluster patterns from different data sets to be numerically 
compared. These methods were also used to transfer information between 
cluster patterns. The location of one object class was transferred 
between a pair of cluster patterns with reasonable results. 

In the future, recognition should be used to evaluate the accuracy of
the methods used to map crop regions from one cluster pattern to another. 
In addition, phenological models of various vegetative spectral classes 
should be developed. 
APPENDIX I: MASC ALGORITHM


We present here a step-by-step guide and flow chart for the MASC
algorithm.

STEP 1. Both the extended-from (set 1) and extended-to (set 2) data
sets are input.

STEP 2. Unsupervised clustering is performed on both data sets. All
input parameters to the cluster program should be the same
for both data sets.

STEP 3. All clusters containing less than 1% of all pixels are removed
from consideration.

STEP 4. After Step 3 above there are $N_1$ clusters from data set 1 and
$N_2$ clusters from data set 2. The minimum of $N_1$ and $N_2$ is
chosen so that a one-to-one correspondence between the two
cluster sets is possible.

STEP 5. The channel containing the largest range of values is chosen for
ordering clusters.

STEP 6. Both cluster sets are ordered on the basis of their mean values
in the above-mentioned channel. The cluster in data set 1 with
the largest mean in the selected channel is labeled number one,
the cluster with the second largest mean is labeled number two,
etc. The same ordering procedure is applied to data set 2.
(Note: By cluster mean we intend the mean of the Gaussian
distribution which represents the cluster.)

STEP 7. A one-to-one correspondence is formed between the first N
clusters of the data sets ($N = \min(N_1, N_2)$). The means of
the corresponded clusters define points in two-space.

STEP 8. Regression is used with the set of points from Step 7 to yield
the parameters to the equation

$$\mu_2^{(i)} = A^{(i)} \mu_1^{(i)} + B^{(i)}, \qquad (1)$$

where $\mu$ denotes a cluster mean, the subscripts represent
data sets 1 and 2, and the superscript $i$ is the channel.

STEPS 9 and 10. Any points whose percentage deviation, in any channel,
from the line of equation (1) is greater than 10% are removed
and regression is re-entered.

STEP 11. The parameters $A^{(i)}$ and $B^{(i)}$ which result from the
regression are used as multiplicative ($A^{(i)}$) and additive
($B^{(i)}$) signature corrections for the signatures from data
set 1. Thus the signatures from data set 1 can be extended to
data set 2.
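
The steps above can be sketched in a few lines of Python. This is only an
illustrative sketch, not the program used in this work: the function name
`masc` and its arguments are hypothetical, the clustering of Step 2 is
assumed to have been done already (the inputs are cluster means and pixel
counts), the "largest range" of Step 5 is taken over the pooled cluster
means of both sets, the percentage deviation of Steps 9 and 10 is taken
relative to the value predicted by the fitted line, and the loop stops
when no point is rejected or fewer than two points would remain. The
appendix does not specify any of these details.

```python
import numpy as np

def masc(means1, counts1, means2, counts2, min_frac=0.01, max_dev=0.10):
    """Sketch of MASC Steps 3-11. Returns per-channel arrays (A, B)."""
    means1, means2 = np.asarray(means1, float), np.asarray(means2, float)
    counts1, counts2 = np.asarray(counts1, float), np.asarray(counts2, float)

    # Step 3: drop clusters holding less than 1% of all pixels.
    m1 = means1[counts1 / counts1.sum() >= min_frac]
    m2 = means2[counts2 / counts2.sum() >= min_frac]

    # Step 4: number of clusters that can be put in correspondence.
    n = min(len(m1), len(m2))
    if n < 2:
        raise ValueError("need at least two clusters in each data set")

    # Step 5: channel with the largest range (assumption: pooled means).
    pooled = np.vstack([m1, m2])
    ch = int(np.argmax(pooled.max(axis=0) - pooled.min(axis=0)))

    # Steps 6-7: order by descending mean in that channel, pair first n.
    x1 = m1[np.argsort(-m1[:, ch])][:n]
    x2 = m2[np.argsort(-m2[:, ch])][:n]

    keep = np.ones(n, dtype=bool)
    while True:
        # Step 8: per-channel straight-line fit x2 = A * x1 + B.
        fits = np.array([np.polyfit(x1[keep, c], x2[keep, c], 1)
                         for c in range(x1.shape[1])])
        A, B = fits[:, 0], fits[:, 1]

        # Steps 9-10: reject pairs deviating by more than 10% in any
        # channel (assumed definition: relative to the fitted value).
        pred = A * x1 + B
        dev = np.abs(x2 - pred) / np.abs(pred)
        bad = keep & (dev > max_dev).any(axis=1)
        if not bad.any() or (keep & ~bad).sum() < 2:
            return A, B          # Step 11: corrections for set 1 signatures
        keep &= ~bad

# Hypothetical cluster means (rows = clusters, columns = 4 channels).
set1 = np.array([[20., 15., 40., 30.], [30., 25., 60., 45.],
                 [40., 30., 50., 35.], [55., 45., 80., 60.],
                 [25., 60., 20., 10.]])
n1 = [100, 200, 300, 400, 1]      # last cluster is below the 1% threshold
set2 = np.array([1.2, 1.1, 0.9, 0.8]) * set1[:4] + np.array([-5., 3., 10., 2.])
n2 = [100, 200, 300, 400]
A_hat, B_hat = masc(set1, n1, set2, n2)
```

In this synthetic example the second set is an exact per-channel linear
transform of the first, so the fit recovers the multiplicative and additive
constants that generated it. With real clusters the correspondence formed by
simple ordering can mispair clusters, which is what the rejection loop of
Steps 9 and 10 is meant to mitigate.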
APPENDIX II
MASC PARAMETERS 

The multiplicative, $A^{(i)}$, and additive, $B^{(i)}$, MASC parameters used
on the CITARS data sets are listed in Table A1 below.


TABLE A1. MASC PARAMETERS USED FOR TDS TO RDS TRANSFORMATIONS

Training            Recognition         Channel (i)    A^(i)      B^(i)
Data Set            Data Set
-----------------------------------------------------------------------
Fayette June 11     Fayette June 10          1         1.201     -5.308
                                             2         1.212     -3.242
                                             3         1.185     -4.729
                                             4         1.139     -0.997

Fayette June 11     Shelby June 8            1         0.794      8.665
                                             2         0.902      3.575
                                             3         0.652     17.711
                                             4         0.605      9.688

White August 21     Fayette August 21        1         2.15     -22.449
                                             2         2.23     -12.841
                                             3         0.78      13.156
                                             4         0.87       2.488
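
As an illustration of how the tabulated parameters would be used, the
following sketch applies the Fayette June 11 to Fayette June 10 values from
Table A1 to a signature. The signature mean and covariance below are
hypothetical numbers chosen only for the example, and the function name
`extend_signature` is likewise hypothetical. The mean is transformed
channel by channel as A * mean + B. Scaling the covariance by the products
of the A values is the standard consequence of a per-channel linear
transformation and is an assumption here, since this appendix does not
state how the covariances were treated.

```python
import numpy as np

# Table A1, Fayette June 11 (TDS) -> Fayette June 10 (RDS), channels 1-4.
A = np.array([1.201, 1.212, 1.185, 1.139])
B = np.array([-5.308, -3.242, -4.729, -0.997])

def extend_signature(mean, cov, A, B):
    """Apply per-channel multiplicative (A) and additive (B) corrections."""
    mean = np.asarray(mean, float)
    cov = np.asarray(cov, float)
    new_mean = A * mean + B
    # Assumption: covariance element (i, j) scales by A[i] * A[j];
    # the additive term does not affect the covariance.
    new_cov = cov * np.outer(A, A)
    return new_mean, new_cov

# Hypothetical training signature (not taken from the report).
mean_tds = np.array([30.0, 25.0, 40.0, 20.0])
cov_tds = np.diag([4.0, 3.0, 9.0, 5.0])

mean_rds, cov_rds = extend_signature(mean_tds, cov_tds, A, B)
```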
APPENDIX III

GENERAL CLUSTER PATTERNS FOR AGRICULTURAL SCENES 

In order to achieve a better understanding of just what is portrayed
in the cluster patterns and why a general or 'complete' cluster pattern
has the shape it does, ERIM's vegetative canopy model [10] was called into
play. As it happened, the necessary model inputs for a certain type of
vegetation, Ionia wheat (a variety grown in Michigan), were readily
available. Two soil reflectances were selected, one to simulate a darker,
perhaps more organic or moist soil and the other to simulate a lighter
colored, perhaps sandier or drier soil (for more information on the
importance of soil moisture to soil reflectance, see Blanchard et al.,
1974 [11] and Parks et al., 1974 [12]). A construction was then made of the
phenology of a sample of Ionia wheat with these two very different soil
backgrounds (see Figure A1). As may be seen, the soil background plays a
dominant role in the bidirectional reflectance of a stand of Ionia wheat
until the onset of plant maturity. If the bare soil points are connected by
a line, hereafter called the bare soil line, the outline of the phenology
of Ionia wheat is very similar to the outline of the 'complete' cluster
pattern. It is not unreasonable to suspect, therefore, that location within
a cluster pattern represents, to a degree, vegetative state of development
as modified by such factors as soil reflectance, stress of various kinds,
mixtures of vegetation and so on.

As an actual example of the extreme variability present in the reflectance
characteristics of a crop such as soybeans (varieties unknown) at the
emergence stage, see the cluster plots for soybeans based on the Fayette
16 July and Livingston 16 July data sets (Figures A2 through A5). By
overlaying the Fayette soybean cluster plot (Figure A2) onto the cluster
plot based on Fayette quarter sections (Figure A3), the fact emerges that
soybeans in Fayette Co. were planted in soils on the upper (brighter) half
of the bare soil line. When one follows the same procedure for Livingston
Co. (Figures A4 and A5 respectively), one sees that soybeans were planted
in soils on the lower (darker) half of the bare soil line. The important
soybean clusters (with most of the points) are 1, 2, 3, 5 and 8 for
Fayette Co. and 1, 2, 3, 4, 5 and 7 for Livingston Co. Analysis of a time
sequence of plots such as these for a variety of vegetative types can aid
in predicting where in the cluster pattern a certain crop should be
(allowing for the sources of variability discussed previously) at a
certain point in its crop calendar.

[10] G. Suits, "The Calculation of the Directional Reflectance of a
Vegetative Canopy", Remote Sensing of Environment, V. 2, 1972, pp. 117-125.

[11] M. B. Blanchard, R. Greeley and R. Goettelman, "Use of Visible,
Near-Infrared, and Thermal Infrared Remote Sensing to Study Soil Moisture",
Proceedings of Ninth International Symposium on Remote Sensing of
Environment, Ann Arbor, Michigan, April 1974.

[12] W. L. Parks, J. I. Sewell, J. W. Hilty and J. C. Rennie, "Utilizing
ERTS Imagery to Detect Plant Diseases and Nutrient Deficiencies, Soil Types
and Soil Moisture Levels", Report No. NAS5-21873, NASA/GSFC, March 1974.
APPENDIX IV

ABBREVIATIONS USED IN THIS REPORT 


ASC      -  Additive Signature Correction
MASC     -  Multiplicative and Additive Signature Correction
MAT      -  Method of Affine Transformations
MLA      -  Mean Level Adjustment
MSS      -  Multispectral Scanner
RADIFF   -  Ratio of Differences in Spectral Bands
Ratio    -  Ratio of Spectral Bands
RDS      -  Recognition Data Set
TDS      -  Training Data Set

Data Sets

F6-10    -  Fayette Co., Illinois, June 10, 1973
F6-11    -  Fayette Co., Illinois, June 11, 1973
S6-8     -  Shelby Co., Indiana, June 8, 1973
F8-21    -  Fayette Co., Illinois, August 21, 1973
W8-21    -  White Co., Indiana, August 21, 1973